Predictive Tagging of Social Media Images using Unsupervised Learning

Authors

  • Nishchol Mishra
  • Sanjay Silakari
Abstract

The popularity of online social media has produced a huge repository of multimedia content. Storing and retrieving this content effectively, and mining useful patterns from it, is a herculean task. This paper deals with the problem of social image tagging. Multimedia tagging, i.e., users assigning tags or keywords to multimedia content such as images, audio, and video, is reshaping the way people search for multimedia resources. This huge amount of data must be mined effectively to discover the useful patterns hidden in it. The scale is considerable: Facebook has more than one billion active users who upload millions of photos daily; YouTube has 490 million unique visitors every month; and people upload 3,000 images every minute to Flickr (the photo-sharing social media site), which hosts over 5 billion images. Beyond general-purpose search, this data is also driving research in diverse areas such as landmark recognition, tag recommendation, tag relevance, and automatic image tagging or annotation. This paper addresses the problem of automatic image tagging, or predictive tagging, of digital images in a social network scenario. Predictive tagging aims to automatically predict tags and to check the relevance of the tags associated with images. This can be accomplished using unsupervised learning. General Terms: automatic image annotation, content-based image retrieval, image annotation, image mining, and multimedia data mining.
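
The abstract says that tags can be predicted with unsupervised learning but does not name a specific algorithm. As a rough illustration only, the sketch below assumes one common approach: cluster image feature vectors with k-means, then suggest the most frequent tags among the images that share a cluster with the query. The feature vectors, tags, and the function names `kmeans` and `predict_tags` are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain k-means; uses the first k points as initial centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def predict_tags(query, features, tags, centroids, top_n=2):
    """Suggest the most frequent tags among images in the query's cluster."""
    target = nearest(query, centroids)
    counts = Counter()
    for feat, image_tags in zip(features, tags):
        if nearest(feat, centroids) == target:
            counts.update(image_tags)
    return [tag for tag, _ in counts.most_common(top_n)]


# Hypothetical 2-D feature vectors (real systems would use colour, texture,
# or learned descriptors) with the user-supplied tags of each image.
features = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
tags = [["beach", "sea"], ["city", "night"], ["beach", "sand"], ["city", "street"]]

centroids = kmeans(features, k=2)
suggested = predict_tags((0.15, 0.15), features, tags, centroids, top_n=1)
print(suggested)
```

The same cluster-level tag counts could also serve as a simple relevance check: a tag that is rare within its image's cluster is a candidate for being irrelevant. That use is likewise an assumption, not a description of the paper's method.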


Similar resources

Bag-of-Features Tagging Approach for a Better Recommendation with Social Big Data

The interests of users are always important for personalized content recommendations on friendships, events and media content from the social big data. However, those interests may not be specified, which makes the recommendations challenging. One of the possible solutions is to analyze the user’s interests from the shared content, especially images with manually annotated tags. They are shared...


LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text

We present a detailed description of our submission to the EmpiriST shared task 2015 for tokenization and part-of-speech tagging of German social media text. As relatively little training data is provided, neither tokenization nor PoS tagging can be learned from the data alone. For tokenization, our system uses regular expressions for general cases and word lists for exceptions. For PoS tagging...


Unsupervised Part-of-Speech Tagging in Noisy and Esoteric Domains With a Syntactic-Semantic Bayesian HMM

Unsupervised part-of-speech (POS) tagging has recently been shown to greatly benefit from Bayesian approaches where HMM parameters are integrated out, leading to significant increases in tagging accuracy. These improvements in unsupervised methods are important especially in specialized social media domains such as Twitter where little training data is available. Here, we take the Bayesian appr...


Similarity measurement for describe user images in social media

Online social networks like Instagram are places for communication. Also, these media produce rich metadata which are useful for further analysis in many fields including health and cognitive science. Many researchers are using these metadata like hashtags, images, etc. to detect patterns of user activities. However, there are several serious ambiguities like how much reliable are these informa...


Automatic Normalization of Word Variations in Code-Mixed Social Media Text

Social media platforms such as Twitter and Facebook are becoming popular in multilingual societies. This trend induces portmanteau of South Asian languages with English. The blend of multiple languages as code-mixed data has recently become popular in research communities for various NLP tasks. Code-mixed data consist of anomalies such as grammatical errors and spelling variations. In this pape...



Publication date: 2013